Reduce whisper decoder file size with onnx export #328

csukuangfj · 2023-09-20T11:25:44Z

Before this pull-request:

-rw-r--r--  1 fangjun  staff   105M Aug  7 16:22 tiny.en-decoder.int8.onnx
-rw-r--r--  1 fangjun  staff   185M Sep 20 19:13 tiny.en-decoder.onnx

With this pull-request:

-rw-r--r--  1 fangjun  staff    86M Sep 20 19:09 tiny.en-decoder.int8.onnx
-rw-r--r--  1 fangjun  staff   109M Sep 20 19:09 tiny.en-decoder.onnx

It turns out that the exported ONNX model stores a transposed copy of self.textDecoder.token_embedding.weight, which is used at the output when computing logits. This PR removes the transposed copy to reduce the file size.
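As a rough sanity check on this explanation, the size of one extra copy of the token embedding can be estimated and compared with the savings in the listings above. This is only a sketch: the dimensions n_vocab = 51864 and n_text_state = 384 are assumptions (they are not stated in this PR), and the int8 estimate assumes one byte per weight.

```python
# Assumed tiny.en decoder dimensions; these values are NOT stated in this PR.
N_VOCAB = 51864       # assumed vocabulary size
N_TEXT_STATE = 384    # assumed embedding width


def embedding_size_mib(n_vocab: int, n_state: int, bytes_per_elem: int) -> float:
    """Size in MiB of one (n_vocab x n_state) weight matrix."""
    return n_vocab * n_state * bytes_per_elem / 1024 / 1024


# Estimated size of one duplicated copy of the embedding matrix.
duplicate_fp32_mib = embedding_size_mib(N_VOCAB, N_TEXT_STATE, 4)
duplicate_int8_mib = embedding_size_mib(N_VOCAB, N_TEXT_STATE, 1)

# Savings read from the ls output above (before minus after, in MB).
observed_fp32_saving = 185 - 109
observed_int8_saving = 105 - 86

print(f"float32 copy: {duplicate_fp32_mib:.2f} MiB, observed saving: {observed_fp32_saving} MB")
print(f"int8 copy:    {duplicate_int8_mib:.2f} MiB, observed saving: {observed_int8_saving} MB")
```

Under these assumptions, one float32 copy is about 76 MiB and one int8 copy is about 19 MiB, which is in line with the reductions from 185M to 109M and from 105M to 86M.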

According to https://github.com/openai/whisper/#available-models-and-languages, tiny.en has 39 M parameters, so its float32 weights should take about 39 * 1e6 * 4 / 1024 / 1024 = 148.77 MB.

Our exported ONNX model file sizes for float32 are

-rw-r--r--  1 fangjun  staff   109M Sep 20 19:31 tiny.en-decoder.onnx
-rw-r--r--  1 fangjun  staff    36M Aug  7 16:22 tiny.en-encoder.onnx

109 + 36 = 145 MB, which is close to the expected file size of 148.77 MB.
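The comparison above can be reproduced with a few lines of Python. This is a minimal sketch that only restates the arithmetic in this comment; the helper name float32_size_mib is made up for illustration, and the file sizes are the rounded values printed by ls.

```python
BYTES_PER_FLOAT32 = 4


def float32_size_mib(n_params: float) -> float:
    """Expected size in MiB of n_params float32 weights (hypothetical helper)."""
    return n_params * BYTES_PER_FLOAT32 / 1024 / 1024


# 39 M parameters, as listed for tiny.en in the whisper README.
expected_mib = float32_size_mib(39e6)

# Rounded sizes of the exported float32 files, from the ls output above.
decoder_mb = 109
encoder_mb = 36
actual_mb = decoder_mb + encoder_mb

# Relative difference between the estimate and the exported files.
relative_gap = (expected_mib - actual_mb) / expected_mib

print(f"expected: {expected_mib:.2f} MB, actual: {actual_mb} MB, gap: {relative_gap:.1%}")
```

The remaining gap is a few percent, which is plausible given that 39 M is itself a rounded parameter count and ls rounds the file sizes.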

csukuangfj · 2023-09-20T11:39:50Z

Please re-download the whisper onnx model or re-export it with the latest code.

Reduce whisper decoder file size with onnx export

c1d31ea

csukuangfj merged commit f5c060d into k2-fsa:master Sep 20, 2023
99 of 109 checks passed

csukuangfj deleted the fix-whisper branch September 20, 2023 11:33
